## **Chapter 1 Exercise Problems**

## Review questions

- Explain/discuss the following term: Amdahl's law.
- Explain/discuss the following term: power wall.
- Explain/discuss the following term: benchmark.
- Explain/discuss the following terms: throughput, response time, CPU time.
- Explain/discuss why MIPS is not a good metric for evaluating performance.
- Explain/discuss the following term: compiler.
- Explain/discuss the following term: machine language.

1.2

- 1.2.1 For a color display using 8 bits for each of the primary colors (red, green, blue) per pixel and with a resolution of 1280 × 800 pixels, what should be the size (in bytes) of the frame buffer to store a frame?
- 1.2.2 If a computer has a main memory of 2 GB, how many frames could it store, assuming the memory contains no other information?
- 1.2.3 If a computer connected to a 1 gigabit Ethernet network needs to send a 256 Kbyte file, how long would it take?
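
The three parts of 1.2 reduce to unit conversions. The sketch below is one way to set them up in Python. It assumes binary prefixes for memory and file size (2 GB = 2 × 2^30 bytes, 256 Kbytes = 256 × 2^10 bytes), a decimal link rate (1 gigabit = 10^9 bits per second), and no protocol overhead. The problem does not fix these conventions, so other readings give slightly different numbers.

```python
# Frame buffer: one byte (8 bits) per primary color per pixel.
BYTES_PER_PIXEL = 3
WIDTH, HEIGHT = 1280, 800
frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL

# Assumption: 2 GB is read as 2 * 2**30 bytes.
memory_bytes = 2 * 2**30
frames_in_memory = memory_bytes // frame_bytes  # whole frames only

# Assumptions: 256 Kbytes = 256 * 2**10 bytes, 1 gigabit/s = 10**9 bits/s,
# and the link carries payload only (no framing or protocol overhead).
file_bits = 256 * 2**10 * 8
transfer_seconds = file_bits / 10**9

print(f"frame buffer: {frame_bytes} bytes")
print(f"frames in 2 GB: {frames_in_memory}")
print(f"transfer time: {transfer_seconds * 1e3:.3f} ms")
```
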

1.3 <1.4> Consider three different processors P1, P2, and P3 executing the same instruction set with the clock rates and CPIs given in the following table.

| Processor | Clock rate | CPI |
|-----------|------------|-----|
| P1        | 2 GHz      | 1.5 |
| P2        | 1.5 GHz    | 1.0 |
| P3        | 3 GHz      | 2.5 |

- 1.3.1 Which processor has the highest performance expressed in instructions per second?
- 1.3.2 If the processors each execute a program in 10 seconds, find the number of cycles and the number of instructions.
- 1.3.3 We are trying to reduce the execution time by 30%, but this leads to an increase of 20% in the CPI. What clock rate should we have to get this time reduction?
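
Problem 1.3 rests on the relation CPU time = instruction count × CPI / clock rate. The sketch below is one way to organize the three parts. The function names are illustrative, and 1.3.3 is read as "same instruction count, time scaled by 0.7, CPI scaled by 1.2", which is an interpretation of the problem statement.

```python
# Clock rate (Hz) and CPI from the table for problem 1.3.
PROCESSORS = {"P1": (2.0e9, 1.5), "P2": (1.5e9, 1.0), "P3": (3.0e9, 2.5)}


def instructions_per_second(clock_hz, cpi):
    """Instruction throughput: cycles per second divided by cycles per instruction."""
    return clock_hz / cpi


def cycles_and_instructions(clock_hz, cpi, seconds):
    """Cycles elapsed in `seconds`, and the instructions those cycles complete."""
    cycles = clock_hz * seconds
    return cycles, cycles / cpi


def required_clock(clock_hz, time_factor, cpi_factor):
    """New clock rate when time scales by time_factor and CPI by cpi_factor.

    From time = instructions * CPI / clock with the instruction count fixed:
    new_clock = old_clock * cpi_factor / time_factor.
    """
    return clock_hz * cpi_factor / time_factor


for name, (clock_hz, cpi) in PROCESSORS.items():
    ips = instructions_per_second(clock_hz, cpi)
    cycles, instructions = cycles_and_instructions(clock_hz, cpi, seconds=10)
    faster = required_clock(clock_hz, time_factor=0.7, cpi_factor=1.2)
    print(f"{name}: {ips:.3e} instr/s, {cycles:.3e} cycles, "
          f"{instructions:.3e} instr, needs {faster / 1e9:.2f} GHz")
```

Note that the required clock rate scales every processor by the same factor, 1.2 / 0.7, because the instruction count cancels out.
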

1.4 Consider two different implementations of the same instruction set architecture. There are four classes of instructions: A, B, C, and D. The clock rate and CPI of each implementation are given in the following table.

|    | Clock rate | CPI Class A | CPI Class B | CPI Class C | CPI Class D |
|----|------------|-------------|-------------|-------------|-------------|
| P1 | 1.5 GHz    | 1           | 2           | 3           | 4           |
| P2 | 2 GHz      | 2           | 2           | 2           | 2           |

- 1.4.1 Given a program with 10<sup>6</sup> instructions divided into classes as follows: 10% class A, 20% class B, 50% class C, and 20% class D, which implementation is faster?
- 1.4.2 What is the global CPI for each implementation?
- 1.4.3 Find the clock cycles required in both cases.
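
For problem 1.4, the global CPI is the average of the per-class CPIs weighted by the instruction mix, and the cycle count and execution time follow from it. A minimal sketch, with the table and the mix from 1.4.1 hard-coded and with illustrative names:

```python
# CPI per instruction class and clock rate (Hz) from the table for problem 1.4.
IMPLEMENTATIONS = {
    "P1": {"clock_hz": 1.5e9, "cpi": {"A": 1, "B": 2, "C": 3, "D": 4}},
    "P2": {"clock_hz": 2.0e9, "cpi": {"A": 2, "B": 2, "C": 2, "D": 2}},
}
MIX = {"A": 0.10, "B": 0.20, "C": 0.50, "D": 0.20}  # fraction of instructions
INSTRUCTION_COUNT = 10**6


def global_cpi(cpi_by_class, mix):
    """Average CPI, weighting each class CPI by its share of the instructions."""
    return sum(mix[c] * cpi_by_class[c] for c in mix)


def summarize(impl):
    """Return (global CPI, total clock cycles, execution time in seconds)."""
    cpi = global_cpi(impl["cpi"], MIX)
    cycles = cpi * INSTRUCTION_COUNT
    return cpi, cycles, cycles / impl["clock_hz"]


for name, impl in IMPLEMENTATIONS.items():
    cpi, cycles, seconds = summarize(impl)
    print(f"{name}: CPI {cpi:.2f}, {cycles:.3e} cycles, {seconds * 1e3:.3f} ms")
```
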

1.8 Suppose we have developed new versions of a processor with the following characteristics.

| Version   | Voltage | Clock rate |
|-----------|---------|------------|
| Version 1 | 5 V     | 0.5 GHz    |
| Version 2 | 3.3 V   | 1 GHz      |

- 1.8.1 [5] <1.5> How much has the capacitive load varied between versions if the dynamic power has been reduced by 10%?
- 1.8.2 [5] <1.5> How much has the dynamic power been reduced if the capacitive load does not change?
- 1.8.3 [10] <1.5> Assuming that the capacitive load of version 2 is 80% of the capacitive load of version 1, find the voltage for version 2 if the dynamic power of version 2 is reduced by 40% from version 1.
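
All three parts of 1.8 compare the two versions through ratios. The sketch below assumes the usual CMOS dynamic power model, power ∝ capacitive load × voltage² × switching frequency, and takes the switching frequency to be the clock rate. The constant factor cancels in every ratio. Function names are illustrative.

```python
import math

# Voltage (V) and clock rate (Hz) from the table for problem 1.8.
V1, F1 = 5.0, 0.5e9
V2, F2 = 3.3, 1.0e9


def power_ratio(cap_ratio, v1=V1, f1=F1, v2=V2, f2=F2):
    """P2/P1 given C2/C1, under the assumed model P ~ C * V**2 * f."""
    return cap_ratio * (v2 / v1) ** 2 * (f2 / f1)


def cap_ratio_for(target_power_ratio, v1=V1, f1=F1, v2=V2, f2=F2):
    """C2/C1 that yields the target P2/P1 at the tabulated voltages and clocks."""
    return target_power_ratio / power_ratio(1.0, v1, f1, v2, f2)


def voltage_for(target_power_ratio, cap_ratio, v1=V1, f1=F1, f2=F2):
    """Version 2 voltage that yields the target P2/P1 for a given C2/C1."""
    return v1 * math.sqrt(target_power_ratio / (cap_ratio * (f2 / f1)))


print(f"1.8.1  C2/C1 = {cap_ratio_for(0.90):.4f}")      # power reduced by 10%
print(f"1.8.2  P2/P1 = {power_ratio(1.0):.4f}")         # same capacitive load
print(f"1.8.3  V2    = {voltage_for(0.60, 0.80):.3f} V")  # C2 = 0.8 C1, power -40%
```
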

1.11

The following table shows manufacturing data for a processor.

| Wafer diameter | Dies per wafer | Defects per unit area        | Cost per wafer |
|----------------|----------------|------------------------------|----------------|
| 15 cm          | 90             | 0.018 defects/cm<sup>2</sup> | 10             |

- 1.11.1 [10] <1.7> Find the yield.
- 1.11.2 [5] <1.7> Find the cost per die.
- 1.11.3 [10] <1.7> If the number of dies per wafer is increased by 10% and the defects per area unit increases by 15%, find the die area and yield.
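
One way to approach 1.11 is sketched below. It relies on two assumptions that the problem text itself does not spell out: the die area is approximated as wafer area divided by dies per wafer (ignoring edge loss), and the yield follows the commonly used model yield = 1 / (1 + defects per area × die area / 2)². If the course uses a different yield model, only `die_yield` needs to change.

```python
import math

# Manufacturing data from the table for problem 1.11.
WAFER_DIAMETER_CM = 15.0
DIES_PER_WAFER = 90
DEFECTS_PER_CM2 = 0.018
COST_PER_WAFER = 10.0  # in whatever currency unit the table intends


def die_area(wafer_diameter_cm, dies_per_wafer):
    """Assumed approximation: circular wafer area split evenly among the dies."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    return wafer_area / dies_per_wafer


def die_yield(defects_per_cm2, area_cm2):
    """Assumed yield model: 1 / (1 + defects * area / 2) ** 2."""
    return 1.0 / (1.0 + defects_per_cm2 * area_cm2 / 2.0) ** 2


def cost_per_die(cost_per_wafer, dies_per_wafer, yield_fraction):
    """Wafer cost spread over the dies that actually work."""
    return cost_per_wafer / (dies_per_wafer * yield_fraction)


area = die_area(WAFER_DIAMETER_CM, DIES_PER_WAFER)
y = die_yield(DEFECTS_PER_CM2, area)
print(f"die area {area:.3f} cm^2, yield {y:.4f}, "
      f"cost per die {cost_per_die(COST_PER_WAFER, DIES_PER_WAFER, y):.4f}")

# 1.11.3: 10% more dies per wafer, 15% more defects per unit area.
new_dies = round(DIES_PER_WAFER * 1.10)
new_area = die_area(WAFER_DIAMETER_CM, new_dies)
new_yield = die_yield(DEFECTS_PER_CM2 * 1.15, new_area)
print(f"new die area {new_area:.3f} cm^2, new yield {new_yield:.4f}")
```
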

1.14 Section 1.8 cites as a pitfall the utilization of a subset of the performance equation as a performance metric. To illustrate this, consider the following data for the execution of a program in different processors.

| Processor | Clock Rate | CPI  |
|-----------|------------|------|
| P1        | 4 GHz      | 1.25 |
| P2        | 3 GHz      | 0.75 |

- 1.14.1 [5] <1.8> One usual fallacy is to consider the computer with the largest clock rate as having the largest performance. Check if this is true for P1 and P2.
- 1.14.2 [10] <1.8> Another fallacy is to consider that the processor executing the largest number of instructions will need a larger CPU time. Considering that processor P1 is executing a sequence of 10<sup>6</sup> instructions and that the CPIs of processors P1 and P2 do not change, determine the number of instructions that P2 can execute in the same time that P1 needs to execute 10<sup>6</sup> instructions.
- 1.14.3 [10] <1.8> A common fallacy is to use MIPS (millions of instructions per second) to compare the performance of two different processors, and consider that the processor with the largest MIPS has the largest performance. Check if this is true for P1 and P2.
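
The comparisons in 1.14 can be checked numerically. The sketch below assumes both processors run the same 10<sup>6</sup>-instruction program for 1.14.1, which the problem implies but does not state outright, and uses MIPS = clock rate / (CPI × 10<sup>6</sup>). Function names are illustrative.

```python
# Clock rate (Hz) and CPI from the table for problem 1.14.
PROCESSORS = {"P1": (4.0e9, 1.25), "P2": (3.0e9, 0.75)}
INSTRUCTIONS = 1e6


def cpu_time(instructions, cpi, clock_hz):
    """CPU time in seconds: instruction count * CPI / clock rate."""
    return instructions * cpi / clock_hz


def instructions_in(seconds, cpi, clock_hz):
    """Instructions completed in `seconds` at the given CPI and clock rate."""
    return seconds * clock_hz / cpi


def mips(cpi, clock_hz):
    """Millions of instructions per second."""
    return clock_hz / (cpi * 1e6)


t1 = cpu_time(INSTRUCTIONS, PROCESSORS["P1"][1], PROCESSORS["P1"][0])
t2 = cpu_time(INSTRUCTIONS, PROCESSORS["P2"][1], PROCESSORS["P2"][0])
n2 = instructions_in(t1, PROCESSORS["P2"][1], PROCESSORS["P2"][0])

print(f"P1 time {t1 * 1e3:.4f} ms, P2 time {t2 * 1e3:.4f} ms")
print(f"P2 executes {n2:.3e} instructions in P1's time")
for name, (clock_hz, cpi) in PROCESSORS.items():
    print(f"{name}: {mips(cpi, clock_hz):.0f} MIPS")
```
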

1.15

Another pitfall cited in Section 1.8 is expecting to improve the overall performance of a computer by improving only one aspect of the computer. This might be true, but not always. Consider a computer running programs with CPU times shown in the following table.

|    | FP instr. | INT instr. | L/S instr. | Branch instr. | Total time |
|----|-----------|------------|------------|---------------|------------|
| a. | 35 s      | 85 s       | 50 s       | 30 s          | 200 s      |
| b. | 50 s      | 80 s       | 50 s       | 30 s          | 210 s      |

1.15.1 [5] <1.8> How much is the total time reduced if the time for FP operations is reduced by 20%?

1.15.2 [5] <1.8> How much is the time for INT operations reduced if the total time is reduced by 20%?

1.15.3 [5] <1.8> Can the total time be reduced by 20% by reducing only the time for branch instructions?
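
The questions in 1.15 all ask how a change in one category's time propagates to the total. A minimal sketch with the table values hard-coded is given below. The function names are illustrative, and each question is read as changing one category while the others stay fixed.

```python
# CPU time in seconds per instruction category, from the table for problem 1.15.
PROGRAMS = {
    "a": {"FP": 35.0, "INT": 85.0, "L/S": 50.0, "Branch": 30.0},
    "b": {"FP": 50.0, "INT": 80.0, "L/S": 50.0, "Branch": 30.0},
}


def total_reduction(times, category, category_cut):
    """Fractional drop in total time when one category's time drops by category_cut."""
    return times[category] * category_cut / sum(times.values())


def required_category_cut(times, category, total_cut):
    """Fractional drop needed in one category to cut the total time by total_cut.

    A result above 1.0 means the target cannot be reached by improving
    that category alone, since its time cannot fall below zero.
    """
    return total_cut * sum(times.values()) / times[category]


for name, times in PROGRAMS.items():
    fp = total_reduction(times, "FP", 0.20)
    int_cut = required_category_cut(times, "INT", 0.20)
    branch_cut = required_category_cut(times, "Branch", 0.20)
    print(f"{name}: FP -20% cuts total by {fp:.2%}; "
          f"total -20% needs INT cut of {int_cut:.2%}; "
          f"branch-only feasible: {branch_cut <= 1.0}")
```
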